class: title-slide

<br>
<br>

# Effects of Early Warning Emails on Student Performance

<br>

.padding_left.pull-down.white[
.bold[_J. Klenke_], T. Massing, N. Reckmann, J. Langerbein, B. Otto, M. Goedicke, C. Hanck

<br>
<br>
<br>

`\(15^{TH}\)` International Conference on Computer Supported Education
Prague, 21-23 April, 2023
]

---

# Outline

<br>

1. [Research Idea and Course Description](#course)
1. [Literature on Warning Systems in Education](#literature)
1. [Model Used: Regression Discontinuity Design](#RDD)
1. [Empirical Results](#results)
1. [Discussion](#discussion)
1. [Further Research](#f_reaserach)
1. [Appendix](#appendix)
1. [References](#references)

---
name: course

# Research Idea and Course Description

- __Research Idea:__ students should receive objective and motivating feedback through a warning email
- Analyzed Course: _Inferential Statistics_ at the University of Duisburg-Essen in the summer semester 2019
  - Compulsory course for several business and economics programs
  - Weekly 2-hour lecture
  - Weekly 2-hour exercise
- [Kahoot!](https://kahoot.com/) games were used to interact with students during classes
- Homework and 5 online tests were offered on the e-assessment platform [JACK](https://s3.paluno.uni-due.de/en/forschung/spalte1/e-learning-und-e-assessment)
- Across all systems, we gathered data on __802__ individuals
- __337__ students took an exam at the end of the semester

---

# Data and Decision Rule

<br>

- We used two data sources
  - First three online test results
  - Cumulative points of the tasks in JACK
- A logit model was used to predict each student's probability of passing the exam based on the first 3 online tests
- The model was trained on the latest data from the previous edition of the same course
- If the predicted probability of passing was `\(\leq 0.4\)`, the student received a warning email

---

# Course Timeline: Main Events

<br>

<div class="figure" style="text-align: center">
<img src="data:image/png;base64,#plots/timeline_plot.png" alt="Timeline for the
key events in the 2019 summer term course Inferential Statistics (treatment cohort)." width="80%" />
<p class="caption">Timeline for the key events in the 2019 summer term course Inferential Statistics (treatment cohort).</p>
</div>

- The shaded area indicates the period after treatment.
- 57 days between the warning email and the `\(1^{st}\)` exam
- 113 days between the warning email and the `\(2^{nd}\)` exam

---
name: literature

# Literature on Warning Systems in Education

.font80[
- <a id='cite-Arnold2012'></a><a href='#bib-Arnold2012'>Arnold and Pistilli (2012)</a> investigated the effect of the signal light system at Purdue University and found a positive effect on student grades
- <a id='cite-baneres2020'></a><a href='#bib-baneres2020'>Bañeres, Rodríguez, Guerrero-Roldán, and Karadeniz (2020)</a> implemented an early warning system but did not analyze its effect on students' performance
- <a id='cite-csahin2019'></a><a href='#bib-csahin2019'>Şahin and Yurdugül (2019)</a> developed an _Intelligent Intervention System_ in which students receive feedback on each assessment. Students emphasized the usefulness of the system.
- <a id='cite-Iver2019'></a><a href='#bib-Iver2019'>Mac Iver, Stein, Davis, Balfanz, and Fox (2019)</a> could not find an effect of their early warning system in the ninth grade.
- <a id='cite-Edmunds2002'></a><a href='#bib-Edmunds2002'>Edmunds and Tancock (2002)</a> analyzed the effects of incentives on third- and fourth-graders' reading motivation and did not find an effect.
]

--

<br>

.blockquote[
- The literature on the effects of warning systems is inconclusive
- Many studies analyzed the systems with questionnaires

.padding_left_2[<svg viewBox="0 0 512 512" style="height:1em;position:relative;display:inline-block;top:.1em;fill:#004c93;" xmlns="http://www.w3.org/2000/svg"> <path d="M256 8c137 0 248 111 248 248S393 504 256 504 8 393 8 256 119 8 256 8zm-28.9 143.6l75.5 72.4H120c-13.3 0-24 10.7-24 24v16c0 13.3 10.7 24 24 24h182.6l-75.5 72.4c-9.7 9.3-9.9 24.8-.4 34.3l11 10.9c9.4 9.4 24.6 9.4 33.9 0L404.3 273c9.4-9.4 9.4-24.6 0-33.9L271.6 106.3c-9.4-9.4-24.6-9.4-33.9 0l-11 10.9c-9.5 9.6-9.3 25.1.4 34.4z"></path></svg> We try to measure the effect directly on students' performance]
]

---
name: RDD

# RDD Example — I

## Parametric Estimation

<br>

<img src="data:image/png;base64,#plots/late_tikz1.png" width="80%" style="display: block; margin: auto;" />

---

# RDD Example — II

## Non-parametric Estimation

<img src="data:image/png;base64,#plots/non_p_late_tikz1.png" width="80%" style="display: block; margin: auto;" />

--

- We used the data-driven approach by <a id='cite-imbensoptimal2009'></a><a href='#bib-imbensoptimal2009'>Imbens and Kalyanaraman (2009)</a> to determine the bandwidth

???

- The method chooses the bandwidth as wide as possible without introducing other confounding effects

---

# Model Assumptions

- The density of the running variable `\(W\)` (predicted probability to pass the exam) needs to be continuous around the cutoff; a jump would indicate that students manipulated their treatment assignment

.pull-left-2[
<div class="figure" style="text-align: center">
<img src="data:image/png;base64,#plots/test_cont_label.png" alt="Graphical illustration of the McCrary sorting test." width="80%" />
<p class="caption">Graphical illustration of the McCrary sorting test.</p>
</div>
]

--

.pull-right-1[
<br>

- There is no jump in the density around the cutoff point of `\(0.4\)`.
- `\(p\)`-value: `\(0.509\)`
- The incentive to manipulate the treatment is quite low.
]

--

.pull-down[
- The standard IV estimation assumptions must also hold
]

---
name: results

# Empirical Results — I

.pull-left-2[
<br>
<div class="figure" style="text-align: center">
<img src="data:image/png;base64,#plots/model_plot_label.png" alt="Graphical illustration of the RDD model." width="80%" />
<p class="caption">Graphical illustration of the RDD model.</p>
</div>
]

--

.pull-right-1[
### Estimates

- LATE: 0.193 (4.889)
- Bandwidth: 0.255
- `\(F\)`-statistic: 0.257
- `\(N\)`: 126

<br>

- We also estimated the RDD with covariates
- The results were identical
]

???

- In theory, you should now see a large jump

---

# Empirical Results — II

- The LATE estimate is positive but not significant
- An estimate of `\(0.193\)` means that students who received the warning email achieved `\(0.193\)` points more than comparable students who did not
- Relative to the `\(60\)`-point exam, the effect size seems limited

--

- The bandwidth of `\(0.255\)` was determined with the data-driven approach of <a href='#bib-imbensoptimal2009'>Imbens and Kalyanaraman (2009)</a>.
- Only students with a predicted probability within `\(0.4\)` (cutoff) `\(\pm 0.255\)` (bandwidth) are included in the analysis.

--

- This leads to a sample size of `\(126\)` students.
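
The inclusion rule for the non-parametric estimation can be illustrated with a minimal Python sketch. It is not the code used for the analysis: the function names and the example probabilities are hypothetical, and only the cutoff of 0.4 and the bandwidth of 0.255 are taken from the slides.

```python
# Minimal sketch (assumption: illustrative only, not the authors' analysis code).
CUTOFF = 0.4       # decision rule: warning if predicted probability <= 0.4
BANDWIDTH = 0.255  # bandwidth reported on the results slide


def assigned_warning(p_pass, cutoff=CUTOFF):
    """True if the decision rule assigns a warning email."""
    return p_pass <= cutoff


def in_estimation_window(p_pass, cutoff=CUTOFF, bandwidth=BANDWIDTH):
    """True if the predicted probability lies within cutoff +/- bandwidth."""
    return abs(p_pass - cutoff) <= bandwidth


# Hypothetical predicted pass probabilities, made up for illustration.
predicted = [0.05, 0.20, 0.40, 0.55, 0.65, 0.90]

# Keep only students close enough to the cutoff to enter the estimation.
window = [p for p in predicted if in_estimation_window(p)]

for p in window:
    group = "warning assigned" if assigned_warning(p) else "no warning assigned"
    print(f"{p:.2f}: {group}")
```

With these made-up values, the students at 0.05 and 0.90 fall outside the window and would not contribute to the estimate.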
---
name: discussion

# Discussion — I

- Our RDD results do not provide evidence that the warning email has a significant effect on students’ results (or behavior)
- The variance around the cutoff is rather high, which makes it harder to detect an effect
- Many individuals are not included in the final analysis for several reasons
  - Students dropping the course
  - Students far away from the cutoff provide little information for the model

.padding_left_2[<svg viewBox="0 0 512 512" style="height:1em;position:relative;display:inline-block;top:.1em;fill:#004c93;" xmlns="http://www.w3.org/2000/svg"> <path d="M256 8c137 0 248 111 248 248S393 504 256 504 8 393 8 256 119 8 256 8zm-28.9 143.6l75.5 72.4H120c-13.3 0-24 10.7-24 24v16c0 13.3 10.7 24 24 24h182.6l-75.5 72.4c-9.7 9.3-9.9 24.8-.4 34.3l11 10.9c9.4 9.4 24.6 9.4 33.9 0L404.3 273c9.4-9.4 9.4-24.6 0-33.9L271.6 106.3c-9.4-9.4-24.6-9.4-33.9 0l-11 10.9c-9.5 9.6-9.3 25.1.4 34.4z"></path></svg> Thus, precise estimation of the treatment effect becomes more difficult]

---

# Discussion — II

- Students also get feedback through their online tests
- The warning may also lead weak students to postpone participation to a later semester
- The costs of postponing exams are quite low in our program
- The objective feedback and motivation from a single warning email are rather small

---
name: f_reaserach

# Further Research

- The effect of such warning emails or systems on the dropout rate requires further attention
- An automated, recurring feedback system could possibly have a greater impact on students’ motivation
- Detailed recurring feedback could also be used to guide students

--

<br>

.blockquote[
We see the open and transparent communication of students’ performance to the students as a positive aspect of the system.
]

---
class: appendix eg
name: appendix

## Appendix: Regression Discontinuity Design (RDD) — I

- The treatment is __not__ randomly assigned and therefore methods like OLS are not suitable
- Treatment is a function of the predicted probability to pass the exam
- Consider the following __sharp__ RDD representation <a id='cite-Huntington-Klein2022'></a><a href='#bib-Huntington-Klein2022'>Huntington-Klein (2022)</a>:

`$$Y_i = \beta_0 + \alpha T_i + \beta W_i + u_i$$`

.padding_left.padding_left.font80[
<ul>
<li> \(W_i\) denotes the predicted probability to pass the final exam</li>
<li> \(T_i\) indicates whether a student received a warning email</li>
<ul style="list-style-type: '‣ ';" >
<li>\(T_i = 1[W_i \leq c]\) , with \(c = 0.4\) </li>
</ul>
<li>\(\alpha\) denotes the treatment effect</li>
<li>\(u_i\) denotes the error term</li>
</ul>
]

--

<svg viewBox="0 0 192 512" style="height:1em;position:relative;display:inline-block;top:.1em;fill:#004c93;" xmlns="http://www.w3.org/2000/svg"> <path d="M176 432c0 44.112-35.888 80-80 80s-80-35.888-80-80 35.888-80 80-80 80 35.888 80 80zM25.26 25.199l13.6 272C39.499 309.972 50.041 320 62.83 320h66.34c12.789 0 23.331-10.028 23.97-22.801l13.6-272C167.425 11.49 156.496 0 142.77 0H49.23C35.504 0 24.575 11.49 25.26 25.199z"></path></svg> `\(\;\)` This design is not suitable for our analysis as our groups are not perfectly separated

---
class: appendix eg

## Appendix: Regression Discontinuity Design (RDD) — II

### Appendix: Fuzzy RDD

- __Fuzzy__ RDD allows us to analyze a treatment in a setting where the two groups are not perfectly separated
- Only the likelihood of receiving the treatment needs to _change_
- The effect is estimated through an instrumental variable estimation <a id='cite-angristidentification1996'></a><a href='#bib-angristidentification1996'>Angrist, Imbens, and Rubin (1996)</a>, in which the `\(\widehat{T}_i\)` are estimated in the first stage and then inserted into the second stage
- First Stage: `$$T_i = \gamma_0 +\gamma_1 Z_i +
\gamma_2 W_i + \nu_i \qquad \quad$$`
- Second Stage: `$$Y_i = \beta_0 + \alpha \widehat{T}_i + \delta_1 W_i + \beta X_i + u_i$$`

---
class: appendix eg

## Appendix: Regression Discontinuity Design (RDD) — III

.font70[
<ul>
<li>RDD compares the individuals around the cutoff to estimate the effect</li>
<li><strong>Main Assumption:</strong> Individuals around the cutoff are comparable and only differ in the treatment assignment</li>
<ul>
<li>The estimate is called Local Average Treatment Effect (<strong>LATE</strong>) </li>
</ul>
<li>For both methods, sharp and fuzzy, the estimation can be either parametric or non-parametric</li>
<ul>
<li>Parametric estimation</li>
<ul>
<li>Uses the whole sample but requires (many) more parameters</li>
<li>Individuals away from the cutoff are less relevant for the estimation of the effect</li>
</ul>
<li>Non-parametric estimation</li>
<ul>
<li>Only <em>comparable</em> (near the cutoff) individuals are used for the analysis</li>
<ul>
<li>Decision whether to include an individual depends on the running variable \(W\)</li>
<li>The groups are determined by the data-driven approach of Imbens and Kalyanaraman (2009)</li>
<ul>
<li>With an \(F\)-test the bandwidth is determined</li>
<li>\(c \pm \text{bandwidth}\) are the two groups</li>
</ul>
</ul>
</ul>
</ul>
</ul>
]

---
name: references

# References

.font80[
Angrist, J. D., G. Imbens, and D. B. Rubin (1996). "Identification of Causal Effects Using Instrumental Variables". In: _Journal of the American Statistical Association_ 91.434. Publisher: Taylor & Francis, pp. 444-455.

Arnold, K. E. and M. Pistilli (2012). "Course signals at Purdue: using learning analytics to increase student success". In: _ACM International Conference Proceeding Series_. LAK '12. ACM, pp. 267-270.

Bañeres, D., M. E. Rodríguez, A. E. Guerrero-Roldán, et al. (2020). "An Early Warning System to Detect At-Risk Students in Online Higher Education". In: _Applied Sciences_ 10.13, p. 4427.

Edmunds, K. and S. M. Tancock (2002).
"Incentives: The effects on the reading motivation of fourth‐grade students". In: _Reading Research and Instruction_ 42.2, pp. 17-37. Huntington-Klein, N. (2022). _The effect : an introduction to research design and causality_. First edition. Chapman & Hall book. Boca Raton ; London ; New York: CRC Press, Taylor & Francis Group. Imbens, G. and K. Kalyanaraman (2009). "Optimal Bandwidth Choice for the Regression Discontinuity Estimator". In: _National Bureau of Economic Research_ 1.14726. Mac Iver, M. A., M. L. Stein, M. H. Davis, et al. (2019). "An Efficacy Study of a Ninth-Grade Early Warning Indicator Intervention". In: _Journal of Research on Educational Effectiveness_ 12.3, pp. 363-390. Şahin, M. and H. Yurdugül (2019). "An intervention engine design and development based on learning analytics: the intelligent intervention system (In 2 S)". In: _Smart Learning Environments_ 6.1, p. 18. ]